Abstract 

Maximum likelihood estimation and a test of fit based on the Anderson-Darling 
statistic are presented for the case of the power law distribution when the parameters 
are estimated from a left-censored sample. Expressions for the maximum likelihood 
estimators and tables of asymptotic percentage points for the $A^2$ statistic are given. 
The technique is illustrated for data from the Dow Jones Industrial Average index, 
an example of high theoretical and practical importance in Econophysics, Finance, 
Physics, Biology and, in general, in other related sciences such as the Complexity 
Sciences. 
1 Introduction 

The statistical model known as the power law distribution has important 
applications in the Natural, Social, Economic and Computing Sciences. Some 
related phenomena involving power law distributions are: scaling, universality, 
criticality, phase transitions [1,2,3], fractals [4], complex networks [5,6], 
earthquakes [7], size of files in a computer system [8], World Wide Web Topology 
[9,10], Information Theory [11], financial indexes and asset price variations 
[12,13], individual income distribution [14,15], and many more. Physicists are 
attempting to produce a general theory in order to explain, under a unified 
point of view, all these phenomena and the mechanisms that drive them to 
produce a power law distribution. The main candidate is currently 
known under the general name of Complex Systems Science or Complexity 
Sciences. Some preliminary progress has been achieved in this direction in the 
context of Economic Complex Systems [16,17]. 

It follows that the applied statistical problem of fitting a power law 
distribution is of great and current interest. However, the difficulties involved 
in fitting a power law to empirical data are many and very subtle (sensitivity 
of the exponent to a large number of data entries and to the selection of the 
cut-off parameter, discrimination of spurious power law distributions, etc.). 
Many related papers proposing new fitting methodologies, criticizing existing 
ones, or proposing alternative statistical models to the power law to describe 
empirical data have been recently published [18,19,20,21,22,23,24,25,26,27,28]. 

This paper can be considered as a proposal to formalize and establish a solid 
statistical procedure to perform a good power law fit in the statistical sense, by 
means of an approach based on the statistical theory of estimation 
and tests of fit from left-censored samples. In section 2 the problem is reviewed 
and formalized, introducing the required technical aspects; in section 3 the 
statistical fitting procedure is described; in section 4 the class of quadratic 
statistics and the Anderson-Darling $A^2$ statistic are defined; in section 5 we 
present the calculations for the asymptotic percentage points used to evaluate 
the goodness of the fit; and in section 6 we show results from a simulation study 
that investigates the speed of convergence of the empirically observed percentage 
points to the calculated asymptotic ones. Section 7 illustrates the fitting 
procedure with empirical data by means of an example from finance of high 
theoretical and practical importance in Econophysics, for financial scientists 
and traders: a power law fit for the tails of a set of daily variations of a financial 
index, the Dow Jones Industrial Average (DJIA). Finally, conclusions are 
presented in section 8. 



2 Preliminaries and Maximum likelihood estimation 

A random variable $Y$ is said to follow a power law distribution if its cumulative 
distribution function (CDF) is given by 

$$F(y;\alpha,\theta) = 1 - \left(\frac{\theta}{y}\right)^{\alpha}, \qquad y > \theta,\ \alpha > 0,\ \theta > 0. \tag{1}$$

In applications it is often required to test goodness of fit to the tails of this 
distribution using only the largest values in the sample, for instance the largest 
returns of a financial index. This situation can be viewed as 
performing estimation and goodness of fit under type II left censoring, which 
corresponds to the case in which, for fixed $r$, the $n-r$ smallest observations 
are missing, so the estimation and test procedures are based on the largest 
$r$ sample values $y_{(n-r+1)}, y_{(n-r+2)}, \ldots, y_{(n)}$, where $y_{(i)}$ denotes the $i$-th order 
statistic in a sample of size $n$ from an absolutely continuous distribution $F$. 



Given that the Anderson-Darling $A^2$ is known to be a powerful statistic for 
detecting departures in the tails from the hypothesized distribution, it becomes 
necessary to obtain its asymptotic distribution when the parameters of the 
distribution have been estimated based on a left-censored sample. Details on 
maximum likelihood estimation for the case of complete samples for the power 
law distribution can be found, for instance, in [17] and [28], appendix B. 

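The log-likelihood for a left-censored sample $y_{(n-r+1)}, \ldots, y_{(n)}$ from the 
distribution (1) is given by 

$$l(\alpha,\theta) = (n-r)\ln\left[1 - \left(\frac{\theta}{y_{(n-r+1)}}\right)^{\alpha}\right] + r\ln(\alpha) + r\alpha\ln(\theta) - (\alpha+1)\sum_{i=1}^{r}\ln y_{(n-r+i)}.$$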



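The maximum likelihood estimators (MLE) of the parameters $\alpha$ and $\theta$ are 
the solution of the equations 

$$\frac{\partial l(\alpha,\theta)}{\partial\alpha} = -(n-r)\,\frac{\left(\dfrac{\theta}{y_{(n-r+1)}}\right)^{\alpha}\ln\left(\dfrac{\theta}{y_{(n-r+1)}}\right)}{1-\left(\dfrac{\theta}{y_{(n-r+1)}}\right)^{\alpha}} + \frac{r}{\alpha} + r\ln(\theta) - \sum_{i=1}^{r}\ln y_{(n-r+i)} = 0, \tag{2}$$

$$\frac{\partial l(\alpha,\theta)}{\partial\theta} = -(n-r)\,\frac{\alpha\,\theta^{\alpha-1}\left(\dfrac{1}{y_{(n-r+1)}}\right)^{\alpha}}{1-\left(\dfrac{\theta}{y_{(n-r+1)}}\right)^{\alpha}} + \frac{r\alpha}{\theta} = 0. \tag{3}$$

It is easy to verify that equation (3) implies $(\theta/y_{(n-r+1)})^{\alpha} = r/n$; substituting 
this into equation (2) gives 

$$\hat{\alpha} = r\left[\sum_{i=1}^{r}\ln y_{(n-r+i)} - r\ln y_{(n-r+1)}\right]^{-1}, \tag{4}$$

$$\hat{\theta} = \left(\frac{r}{n}\right)^{1/\hat{\alpha}} y_{(n-r+1)}. \tag{5}$$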
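
The closed-form estimators (4) and (5) translate directly into code. The following is a minimal sketch in Python with NumPy; the function name `power_law_mle_left_censored` and its arguments are illustrative choices and do not come from the paper.

```python
import numpy as np


def power_law_mle_left_censored(y, r):
    """Return (alpha_hat, theta_hat) from the r largest values of y.

    Implements formulas (4) and (5). The name and signature are
    hypothetical; y is the full sample of size n, of which only the
    r largest observations are actually used.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    if not 1 < r <= n:
        raise ValueError("r must satisfy 1 < r <= n")
    tail = np.sort(y)[n - r:]          # y_(n-r+1) <= ... <= y_(n)
    y_min = tail[0]                    # y_(n-r+1), smallest retained value
    # Formula (4): alpha_hat = r / (sum of log tail values - r * log y_(n-r+1))
    alpha_hat = r / (np.sum(np.log(tail)) - r * np.log(y_min))
    # Formula (5): theta_hat = (r/n)^(1/alpha_hat) * y_(n-r+1)
    theta_hat = (r / n) ** (1.0 / alpha_hat) * y_min
    return alpha_hat, theta_hat
```

A consequence of (5) worth noting is that the fitted CDF evaluated at the smallest retained observation equals the censoring proportion exactly, since $1 - (\hat{\theta}/y_{(n-r+1)})^{\hat{\alpha}} = 1 - r/n$.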
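Let us denote Fisher's information matrix by 

$$I(\alpha,\theta) = -E\begin{bmatrix} \dfrac{\partial^2 l(\alpha,\theta)}{\partial\alpha^2} & \dfrac{\partial^2 l(\alpha,\theta)}{\partial\alpha\,\partial\theta} \\[2ex] \dfrac{\partial^2 l(\alpha,\theta)}{\partial\theta\,\partial\alpha} & \dfrac{\partial^2 l(\alpha,\theta)}{\partial\theta^2} \end{bmatrix}.$$

Assuming that the proportion of censoring $q = 1 - r/n$ remains constant as 
$n \to \infty$, we obtain the following limiting expressions: 

$$V = \lim_{n\to\infty}\frac{1}{n}\,I(\alpha,\theta) = \begin{bmatrix} \dfrac{(1-q)\left[q+(\ln(1-q))^2\right]}{\alpha^2 q} & \dfrac{(1-q)\ln(1-q)}{\theta q} \\[2ex] \dfrac{(1-q)\ln(1-q)}{\theta q} & \dfrac{\alpha^2(1-q)}{q\,\theta^2} \end{bmatrix}.$$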

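Thus, the asymptotic variance-covariance matrix of the maximum likelihood 
estimators will assume the form $n^{-1}V^{-1}$, where 

$$V^{-1} = \begin{bmatrix} \dfrac{\alpha^2}{1-q} & -\dfrac{\ln(1-q)\,\theta}{1-q} \\[2ex] -\dfrac{\ln(1-q)\,\theta}{1-q} & \dfrac{\left[q+(\ln(1-q))^2\right]\theta^2}{\alpha^2(1-q)} \end{bmatrix}.$$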



For the case of complete samples (i.e. $q = 0$) the estimate $\hat{\theta}$ is super-efficient 
in the sense that its asymptotic variance is $O(n^{-2})$, so $n^{-1}\lim_{q\to 0}\mathrm{Var}(\hat{\theta}) = 0$; 
in fact, for $n > 2$, $\mathrm{Var}(\hat{\theta}) = \theta^2 n\alpha\left[(\alpha n-2)(\alpha n-1)^2\right]^{-1}$. For practical 
applications this means that the asymptotic distributional results will be 
identical to the case when the parameter $\theta$ is known. On the other hand, 
$\mathrm{Var}(\hat{\alpha}) = n^{-1}\lim_{q\to 0}\alpha^2/(1-q) = \alpha^2/n$. 

3 Test procedures 



Suppose that we are interested in testing the null hypothesis that the random 
sample $y_1, \ldots, y_n$ was drawn from the distribution (1), based on the $r$ largest 
observations. The test can be performed as follows: 

(1) Find the maximum likelihood estimators $\hat{\alpha}$ and $\hat{\theta}$ of the parameters $\alpha$ 
and $\theta$ in (1), using formulas (4) and (5). 

(2) Obtain the order statistics $y_{(n-r+1)} < \cdots < y_{(n)}$ and compute 
$z_{(n-i+1)} = F(y_{(n-i+1)}; \hat{\alpha}, \hat{\theta})$ for $i = 1, \ldots, r$. 

(3) Compute the Anderson-Darling statistic in its version for a type II left-censored 
sample: 

$$A^2_{n,r} = -\frac{1}{n}\sum_{i=1}^{r}(2i-1)\left\{\ln\left[1-z_{(n-i+1)}\right] - \ln z_{(n-i+1)}\right\} - 2\sum_{i=1}^{r}\ln z_{(n-i+1)} - \frac{1}{n}\left[(r-n)^2\ln z_{(n-r+1)} - r^2\ln\left(1-z_{(n-r+1)}\right) + n^2\left(1-z_{(n-r+1)}\right)\right].$$

(4) Using the value $q = 1 - r/n$, the proportion of left-censoring, refer to 
table 1. If the value of the test statistic exceeds the value in the table, 
for a given significance level, reject the null hypothesis. 
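
Step (3) can be sketched in Python as follows. This is only an illustration of the left-censored Anderson-Darling formula, with the leading sign of the first sum taken as negative, as in the standard censored-sample form; the function name `ad_left_censored` is hypothetical, and the input is assumed to be the $r$ fitted CDF values $z_{(n-r+1)}, \ldots, z_{(n)}$ from step (2), all strictly between 0 and 1.

```python
import numpy as np


def ad_left_censored(z_tail, n):
    """Anderson-Darling statistic for a type II left-censored sample.

    z_tail: the r values z_(n-r+1), ..., z_(n) (any order), each in (0, 1).
    n: size of the full sample. Name and signature are illustrative.
    """
    z = np.sort(np.asarray(z_tail, dtype=float))[::-1]  # z[i-1] = z_(n-i+1)
    r = z.size
    i = np.arange(1, r + 1)
    z_low = z[-1]                                       # z_(n-r+1)
    # -(1/n) * sum (2i-1) { ln[1 - z_(n-i+1)] - ln z_(n-i+1) }
    term1 = -np.sum((2 * i - 1) * (np.log1p(-z) - np.log(z))) / n
    # -2 * sum ln z_(n-i+1)
    term2 = -2.0 * np.sum(np.log(z))
    # -(1/n) [ (r-n)^2 ln z_low - r^2 ln(1 - z_low) + n^2 (1 - z_low) ]
    term3 = -((r - n) ** 2 * np.log(z_low)
              - r ** 2 * np.log1p(-z_low)
              + n ** 2 * (1.0 - z_low)) / n
    return term1 + term2 + term3
```

As a sanity check on the formula, for $r = 1$ the expression reduces to $n(z - 1 - \ln z)$ with $z = z_{(n)}$, which is the integral of $n(1-t)/t$ over $(z, 1)$ and is therefore non-negative.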



Table 1 

Upper percentage points of the asymptotic distribution of the $A^2_{n,r}$ statistic, for 
selected censoring proportions. 

Censoring proportion               Significance level 
q = 1 - r/n            0.15      0.10      0.05      0.025     0.01 

0.00                   0.9123    1.0588    1.3181    1.5873    1.9554 
0.05                   0.7364    0.8566    1.0695    1.2905    1.5925 
0.10                   0.6354    0.7388    0.9217    1.1114    1.3706 
0.15                   0.5584    0.6489    0.8087    0.9743    1.2005 
0.20                   0.4950    0.5748    0.7157    0.8616    1.0607 
0.25                   0.4406    0.5114    0.6361    0.7652    0.9414 
0.30                   0.3928    0.4557    0.5663    0.6808    0.8368 
0.35                   0.3500    0.4058    0.5039    0.6054    0.7436 
0.40                   0.3111    0.3606    0.4474    0.5372    0.6594 
0.45                   0.2755    0.3191    0.3957    0.4748    0.5825 
0.50                   0.2425    0.2808    0.3480    0.4173    0.5117 
0.55                   0.2118    0.2451    0.3036    0.3639    0.4460 
0.60                   0.1830    0.2118    0.2621    0.3141    0.3847 
0.65                   0.1559    0.1804    0.2232    0.2673    0.3273 
0.70                   0.1303    0.1507    0.1864    0.2231    0.2731 
0.75                   0.1060    0.1226    0.1515    0.1813    0.2219 
0.80                   0.0829    0.0958    0.1184    0.1417    0.1732 
0.85                   0.0608    0.0703    0.0868    0.1039    0.1270 
0.90                   0.0397    0.0459    0.0567    0.0677    0.0828 
0.95                   0.0195    0.0225    0.0278    0.0332    0.0405 
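
Step (4) of the procedure amounts to a table lookup. The sketch below encodes table 1 and returns the bracket for the probability value implied by an observed statistic. The names `TABLE1`, `LEVELS` and `p_value_bracket` are hypothetical, and snapping $q$ to the nearest tabulated proportion is an assumption made here for convenience, not a rule stated in the paper.

```python
# Significance levels of table 1, in column order.
LEVELS = (0.15, 0.10, 0.05, 0.025, 0.01)

# Table 1, keyed by censoring proportion in percent (100 * q).
TABLE1 = {
    0: (0.9123, 1.0588, 1.3181, 1.5873, 1.9554),
    5: (0.7364, 0.8566, 1.0695, 1.2905, 1.5925),
    10: (0.6354, 0.7388, 0.9217, 1.1114, 1.3706),
    15: (0.5584, 0.6489, 0.8087, 0.9743, 1.2005),
    20: (0.4950, 0.5748, 0.7157, 0.8616, 1.0607),
    25: (0.4406, 0.5114, 0.6361, 0.7652, 0.9414),
    30: (0.3928, 0.4557, 0.5663, 0.6808, 0.8368),
    35: (0.3500, 0.4058, 0.5039, 0.6054, 0.7436),
    40: (0.3111, 0.3606, 0.4474, 0.5372, 0.6594),
    45: (0.2755, 0.3191, 0.3957, 0.4748, 0.5825),
    50: (0.2425, 0.2808, 0.3480, 0.4173, 0.5117),
    55: (0.2118, 0.2451, 0.3036, 0.3639, 0.4460),
    60: (0.1830, 0.2118, 0.2621, 0.3141, 0.3847),
    65: (0.1559, 0.1804, 0.2232, 0.2673, 0.3273),
    70: (0.1303, 0.1507, 0.1864, 0.2231, 0.2731),
    75: (0.1060, 0.1226, 0.1515, 0.1813, 0.2219),
    80: (0.0829, 0.0958, 0.1184, 0.1417, 0.1732),
    85: (0.0608, 0.0703, 0.0868, 0.1039, 0.1270),
    90: (0.0397, 0.0459, 0.0567, 0.0677, 0.0828),
    95: (0.0195, 0.0225, 0.0278, 0.0332, 0.0405),
}


def p_value_bracket(a2, q):
    """Return (lower, upper) such that lower < p < upper for statistic a2.

    q is snapped to the nearest tabulated proportion (an assumption).
    """
    key = 5 * round(q * 100 / 5)
    if key not in TABLE1:
        raise ValueError("censoring proportion outside table 1")
    # Critical values increase as the level decreases, so counting the
    # exceeded ones locates the bracket.
    n_exceeded = sum(a2 > c for c in TABLE1[key])
    upper = 1.0 if n_exceeded == 0 else LEVELS[n_exceeded - 1]
    lower = 0.0 if n_exceeded == len(LEVELS) else LEVELS[n_exceeded]
    return lower, upper
```

For instance, `p_value_bracket(0.077, 0.90)` compares 0.077 with the row for $q = 0.90$; it exceeds the 0.025 point (0.0677) but not the 0.01 point (0.0828), so the bracket is $0.01 < p < 0.025$.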
4 Quadratic statistics and asymptotic theory 

The Anderson-Darling $A^2$ statistic belongs to a class of discrepancy measures 
of the form 

$$n\int_{-\infty}^{\infty}\left[F_n(x) - F(x;\boldsymbol{\theta})\right]^2\psi(x)\,dF(x;\boldsymbol{\theta}),$$

known as quadratic statistics, where $F_n$ denotes the Empirical Distribution 
Function (EDF) of a random sample from an absolutely continuous 
distribution $F$, $\boldsymbol{\theta}$ denotes a vector parameter and $\psi$ is a weighting function. 

When $\psi(x) = 1$, the resulting statistic is the well known Cramér-von Mises 
$W^2$. In order to put more weight in the tails, the Anderson-Darling $A^2$ statistic 
is obtained for $\psi(x) = \left\{F(x;\boldsymbol{\theta})\left[1 - F(x;\boldsymbol{\theta})\right]\right\}^{-1}$. 

In this section, the process of finding the asymptotic distribution of the EDF 
statistics is briefly summarized. For a more detailed treatment of the subject, 
the reader is referred to the works by D'Agostino, Durbin and Stephens 
[29,30,31]. 

The asymptotic theory for doubly censored samples with known parameters 
has been given in Pettitt and Stephens [32]. Pettitt [33] modified the theory 
in Durbin [30] for testing normality from censored samples with parameters 
estimated by maximum likelihood. Here, these results are used to obtain the 
asymptotic distribution for the power law distribution under type II censoring. 
In the following, it will be assumed that the proportion censored, $q = 1 - r/n$, 
remains constant as $n$ tends to infinity. 

Let $\hat{\boldsymbol{\theta}}$ denote the maximum likelihood estimator of the vector parameter $\boldsymbol{\theta}$, 
with estimates where necessary. For a singly left-censored sample, the process 
$\sqrt{n}\,\{F_n(x) - F(x;\hat{\boldsymbol{\theta}})\}$, evaluated at $t = F(x;\hat{\boldsymbol{\theta}})$, converges weakly to a 
Gaussian process $\{Y(t) : t \in (q,1)\}$ with a certain covariance function $\rho(s,t)$. The 
limiting distribution will depend on the functional form of $F$, and on which 
parameters are being estimated. 

The statistic $A^2$ is asymptotically a functional of the process $Y(t)$; namely, 
$A^2$ converges weakly to $\int_q^1 a^2(t)\,dt$, where $a(t) = Y(t)/\sqrt{t(1-t)}$. $Y(t)$ and 
$a(t)$ are both Gaussian processes defined in $(q,1)$, with covariance functions 
$\rho(s,t)$ and $\rho_a(s,t) = \rho(s,t)\left[(s-s^2)(t-t^2)\right]^{-1/2}$, respectively, for $q < s, t < 1$. 

It is known that the limiting distribution (see for example, Durbin [30]) is that 
of $\sum_{i=1}^{\infty}\lambda_i\nu_i$, where $\nu_1, \nu_2, \ldots$ are independent chi-square random variables with 
one degree of freedom, and $\lambda_1, \lambda_2, \ldots$ are the eigenvalues of the integral equation 

$$\int_q^1 \rho^*(s,t)\,f_i(s)\,ds = \lambda_i f_i(t), \tag{6}$$

where $\rho^*$ denotes the covariance function corresponding to the limiting process 
on which the test statistic is based; in our case, $\rho_a(s,t)$. 



5 Asymptotic percentage points 

In samples from the power law distribution defined in (1), with a left-censored 
proportion $q$, the limiting Gaussian process was found to have a covariance 
function given by 

$$\rho(s,t) = \min(s,t) - st - (1-s)(1-t)(1-q)^{-1}\left[\ln(1-s)\ln(1-t) - \ln(1-t)\ln(1-q) - \ln(1-s)\ln(1-q) + q + \ln^2(1-q)\right].$$

Therefore the asymptotic distribution of the Anderson-Darling statistic will 
not depend on the particular values of the parameters $\theta$ and $\alpha$. 

Also, for $q = 0$, when the full sample is available, 

$$\rho(s,t) = \min(s,t) - st - (1-s)(1-t)\ln(1-s)\ln(1-t).$$

The asymptotic points were found numerically using 400 points in $(q,1)$ to 
approximate the integral and solve (6). In a 400 x 400 grid, the appropriate 
covariance function was evaluated and the eigenvalue problem solved for 
different values of $q$, the proportion of censoring, ranging from 0.05 to 0.95, 
with increments of 0.05 units. These eigenvalues were then used to calculate 
the asymptotic percentage points using Imhof's method [34]. The results are 
shown in table 1. The row corresponding to $q = 0$ denotes the asymptotic 
percentage points for complete samples. 
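
The numerical step described above can be sketched as follows. The grid (midpoints of 400 equal subintervals of $(q,1)$) and the function names are assumptions made for illustration; the paper does not state which quadrature nodes were used. In place of Imhof's method, which is not reproduced here, the second function approximates a percentage point by plain Monte Carlo simulation of the weighted chi-square sum, so its output is only a rough check on table 1.

```python
import numpy as np


def ad_eigenvalues(q, m=400):
    """Approximate eigenvalues of equation (6) for the kernel rho_a.

    Uses a midpoint rule with m nodes on (q, 1); this discretization is
    an assumption, chosen to avoid the end points t = q and t = 1.
    """
    h = (1.0 - q) / m
    t = q + (np.arange(m) + 0.5) * h
    s_grid, t_grid = np.meshgrid(t, t, indexing="ij")
    log_q = np.log1p(-q)
    ls = np.log1p(-s_grid) - log_q     # ln(1-s) - ln(1-q)
    lt = np.log1p(-t_grid) - log_q     # ln(1-t) - ln(1-q)
    # The bracket in rho(s,t) factorizes as ls*lt + q.
    rho = (np.minimum(s_grid, t_grid) - s_grid * t_grid
           - (1 - s_grid) * (1 - t_grid) / (1 - q) * (ls * lt + q))
    rho_a = rho / np.sqrt(s_grid * (1 - s_grid) * t_grid * (1 - t_grid))
    # Symmetric matrix, eigenvalues returned in decreasing order.
    return np.linalg.eigvalsh(rho_a * h)[::-1]


def mc_percentage_point(lam, level, n_draws=100_000, k=50, seed=0):
    """Monte Carlo estimate of the upper `level` point of sum(lam_i * chi2_1).

    Only the k largest eigenvalues are simulated; the remaining ones are
    replaced by their mean contribution (a simplification, not Imhof).
    """
    rng = np.random.default_rng(seed)
    chi2 = rng.chisquare(1, size=(n_draws, k))
    draws = chi2 @ lam[:k] + lam[k:].sum()
    return float(np.quantile(draws, 1.0 - level))
```

Calling `mc_percentage_point(ad_eigenvalues(0.0), 0.05)` should give a value close to the entry for $q = 0$ and significance level 0.05 in table 1, up to simulation and discretization error.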



6 Small sample distributions 

A simulation study was performed to investigate the speed of convergence of 
the empirical percentage points to the asymptotic ones, considering different 
censoring proportions and sample sizes. For each combination of $q$ and $n$, 
ten thousand pseudo-random samples from the distribution (1) were generated 
and the statistic $A^2_{n,r}$ was then calculated to estimate the empirical percentage 
points. The results are shown in table 2. The standard errors of the estimated 
percentage points, say $\hat{\xi}_{1-\alpha}$, for a given significance level $\alpha$ (not to be confused 
with the shape parameter of (1)), were approximated using the asymptotic expression 

$$SE(\hat{\xi}_{1-\alpha}) = \frac{1}{g(\xi_{1-\alpha})}\sqrt{\frac{\alpha(1-\alpha)}{N}},$$

where $N$ denotes the number of simulations (in this case, $N = 10000$) and $g$ is the 
density of the simulated test statistic. The density at each point can be estimated by 
approximating the derivative of the empirical cumulative distribution function 
using two adjacent percentage points. 
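
The standard error expression can be illustrated with a short Python sketch. The function name and the `delta` parameter, which sets how far apart the two neighbouring percentage points are taken, are assumptions introduced here; the paper does not specify which adjacent points were used.

```python
import numpy as np


def percentage_point_with_se(stats, level, delta=0.01):
    """Empirical upper percentage point and its approximate standard error.

    stats: simulated values of the test statistic (length N).
    level: significance level, e.g. 0.05.
    delta: half-width, in probability, of the finite difference used to
           estimate the density g (a hypothetical tuning choice).
    """
    stats = np.asarray(stats, dtype=float)
    n_sim = stats.size
    p = 1.0 - level
    xi = np.quantile(stats, p)
    # Density estimate: slope of the empirical CDF between two
    # neighbouring percentage points.
    p_lo, p_hi = max(p - delta, 0.0), min(p + delta, 1.0)
    xi_lo, xi_hi = np.quantile(stats, [p_lo, p_hi])
    density = (p_hi - p_lo) / (xi_hi - xi_lo)
    se = np.sqrt(level * (1.0 - level) / n_sim) / density
    return float(xi), float(se)
```

For a quick plausibility check, values drawn from a uniform distribution on $(0,1)$ have density 1, so with $N = 10000$ and level 0.05 the standard error should be close to $\sqrt{0.05 \cdot 0.95 / 10000} \approx 0.0022$.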

Table 2: Empirical percentage points of the statistic $A^2_{n,r}$ for 
selected censoring proportions and sample sizes. The estimated 
standard errors are shown within parentheses. Entries marked -- are 
not legible in the source. 

Censoring proportion               Significance level 
q = 1 - r/n   n      0.15          0.10          0.05          0.01 

0.050         100    0.7237        0.8464        1.0406        1.5848 
                     (0.88×10^-2)  (0.12×10^-1)  (0.30×10^-1)  (0.14×10^-1) 
              300    0.7318        0.8434        1.0450        1.5880 
                     (0.80×10^-2)  (0.12×10^-1)  (0.30×10^-1)  (0.14×10^-1) 
              ∞      0.7364        0.8566        1.0695        1.5925 

0.100         100    0.6485        0.7525        0.9404        1.3762 
                     (0.74×10^-2)  (0.11×10^-1)  (0.24×10^-1)  (0.11×10^-1) 
              300    0.6269        0.7244        0.9053        1.3509 
                     (0.70×10^-2)  (0.11×10^-1)  (0.24×10^-1)  (0.11×10^-1) 
              ∞      0.6354        0.7388        0.9217        1.3706 

0.250         100    0.4582        0.5292        0.6541        0.9693 
                     (0.51×10^-2)  (0.75×10^-2)  (0.17×10^-1)  (0.78×10^-2) 
              300    0.4419        0.5129        0.6400        0.9568 
                     (0.51×10^-2)  (0.76×10^-2)  (0.17×10^-1)  (0.79×10^-2) 
              ∞      0.4406        0.5114        0.6361        0.9414 

0.500         100    0.2470        0.2840        0.3550        0.5184 
                     (0.26×10^-2)  (0.43×10^-2)  (0.89×10^-2)  (0.41×10^-2) 
              300    0.2459        0.2838        0.3526        0.5080 
                     (0.27×10^-2)  (0.41×10^-2)  (0.85×10^-2)  (0.39×10^-2) 
              ∞      0.2425        0.2808        0.3480        0.5117 

0.750         100    0.1052        0.1209        0.1516        0.2247 
                     (0.11×10^-2)  (0.18×10^-2)  (0.40×10^-2)  (0.18×10^-2) 
              300    0.1058        0.1238        0.1517        0.2168 
                     (0.13×10^-2)  (0.17×10^-2)  (0.35×10^-2)  (0.16×10^-2) 
              ∞      0.1060        0.1226        0.1515        0.2219 

0.900         100    --            0.0413        --            -- 
                     (--)          (--)          (--)          (--) 
              300    --            --            --            -- 
                     (--)          (--)          (--)          (--) 
              ∞      0.0397        0.0459        0.0567        0.0828 

0.950         100    --            --            --            -- 
                     (0.19×10^-3)  (0.28×10^-3)  (0.66×10^-3)  (0.30×10^-3) 
              300    0.0197        0.0228        0.0282        0.0422 
                     (0.22×10^-3)  (0.32×10^-3)  (0.76×10^-3)  (0.35×10^-3) 
              ∞      0.0195        0.0225        0.0278        0.0405 


These results suggest that the speed of convergence of the empirical percentage 
points to the asymptotic ones does not depend significantly on the proportion of 
censoring. Thus, the asymptotic percentage points 
can be used with good accuracy for moderately large $n$. 



7 An example taken from Finance: Fitting the tails of the Dow 
Jones Index Daily Variations 

In order to illustrate the technique¹, we consider the series consisting of 5001 
standardized returns computed from the daily closing values of the Dow Jones 
Industrial Average Index (DJIA). The data include values from January 1st, 
1990 to November 3rd, 2009 and can be obtained from www.yahoo.com. 
The histogram for the set of standardized returns is shown in figure 1. 

Fig. 1. Histogram for standardized returns. It has 5001 entries. 

Suppose that we are interested in performing a power law fit based on the 
largest $r = 385$ positive values of the standardized returns. Using formulas 
(4) and (5) we obtain $\hat{\theta} = 0.528$ and $\hat{\alpha} = 2.387$. The calculated value of the 
Anderson-Darling statistic is $A^2_{n,r} = 0.18$, which exceeds the value in table 1 
(0.1270) for a proportion of censoring $q = 0.85$ and significance level 0.01. It is then 
concluded that the power law model should be rejected with a probability 
value $p < 0.01$. See figure 2a. 

¹ A program to perform the analysis described here is available on request from 
hcoronel@uv.mx. Later upgraded versions will be made available on a more formal web 
site. 

A second fit based on the $r = 257$ largest positive returns gives $\hat{\theta} = 0.622$, 
$\hat{\alpha} = 2.728$ and $A^2_{n,r} = 0.027$, which is less than the 0.15 critical point for a 
censoring rate $q = 0.90$. In this case the power law fit is not rejected, with an 
associated probability value $p > 0.15$, so the fit is considered good. Figure 2b 
shows the resulting fit. 



Fig. 2. Fitted (solid line) and empirical (dashed line) cumulative distribution 
functions (CDF), plotted against standardized returns, for: a) the largest 385 
positive returns, b) the largest 257 positive returns. 

This procedure can be applied to the left tail, considering the absolute values 
of the 2436 negative returns. A test based on the largest $r = 244$ values, 
corresponding to a censoring proportion $q = 0.90$, gives $\hat{\theta} = 0.625$, $\hat{\alpha} = 2.562$. 
The value $A^2_{n,r} = 0.077$ gives a probability value $0.01 < p < 0.025$, indicating 
strong evidence against the power law fit shown in figure 3a. If we now consider 
a censoring proportion $q = 0.95$, the results for $r = 122$ are $\hat{\theta} = 0.785$, 
$\hat{\alpha} = 3.066$ and $A^2_{n,r} = 0.009$; the associated probability value is now $p > 0.15$, 
indicating a very good fit. This fit is shown in figure 3b. 



Fig. 3. Fitted (solid line) and empirical (dashed line) cumulative distribution 
functions (CDF), plotted against standardized returns, for: a) the largest 244 
absolute negative returns, b) the largest 122 absolute negative returns. 

8 Conclusions 

Applying the theory in Durbin [30], the percentage points of the asymptotic 
distribution of the Anderson-Darling $A^2$ statistic were obtained numerically, 
and tables for testing goodness of fit for the power law distribution, when the 
parameters are estimated from a left-censored sample, were provided. Results 
from a simulation study showed that a test of fit for this distribution can 
be performed with good accuracy using the asymptotic percentage points for 
moderately large samples. It was also found that the speed of convergence of 
the empirical percentage points to the asymptotic ones does not show a significant 
dependence on the censoring rate. 

Given that the test is based on the Anderson-Darling $A^2$ statistic, which puts 
more weight in the tails of the distribution, the resulting test appears to be 
demanding, as can be concluded from the cases described in the example. 